Speech-noise discriminator



United States Patent 3,507,999
SPEECH-NOISE DISCRIMINATOR
Manfred R. Schroeder, Gillette, N.J., assignor to Bell Telephone Laboratories, Incorporated, Murray Hill, N.J., a corporation of New York
Filed Dec. 20, 1967, Ser. No. 692,230
Int. Cl. G] 1/02
U.S. Cl. 179-1
9 Claims

ABSTRACT OF THE DISCLOSURE

The presence of speech is detected in a speech-noise signal by examining signal components in a plurality of contiguous spectral regions for common periodicity. By comparing a signal representative of the degree of common periodicity between such spectral components with a reference signal proportional to the power of the received signal, difficulties caused by relative variations in the noise level and in the frequency of the speech or noise signals are eliminated.

This invention relates to wave analyzing apparatus, and more specifically to apparatus for identifying the presence of a speech wave in an ambient noise environment.

BACKGROUND OF THE INVENTION

Communication signals in general and speech signals in particular are frequently mixed with extraneous noise which may mask the desired signal. Such noise may originate in the communication system itself or may be generated by unwanted sources at the transmission point. In either situation, the noise is likely to vary in both amplitude and frequency and may occupy the same frequency spectrum as the desired voice signal, thereby making the identification of a speech signal particularly difficult.

There are several important features which distinguish a speech signal from noise. As is well known, normal speech signals are composed of voiced and unvoiced speech elements. Voiced speech elements have an amplitude spectrum composed of a number of individual frequency components of various amplitudes which occur at harmonics of the fundamental frequency of the sound. These voiced speech signal components are periodic with a common fundamental period in all parts of the frequency spectrum. Noise and unvoiced speech sounds normally have a more continuous, nonharmonic frequency structure. Even when noise does contain discrete frequency constituents, these constituents rarely have a single fundamental period.

Field of the invention

The ability to distinguish between voiced speech and noise or unvoiced speech is important in many communication systems. In voice operated equipment and in automatic microphone directing systems, the ability to distinguish speech from noise quickly and automatically is particularly critical. Similarly, the coding schemes employed in many vocoder speech communication systems require that the transmitting station distinguish between voiced and unvoiced speech portions before coding.

3,507,999 Patented Apr. 21, 1970

DESCRIPTION OF THE PRIOR ART

In vocoder systems, the separation of voiced and unvoiced speech portions is often accomplished by dividing a speech signal into two spectral regions and subtracting the signal in the high frequency region from the low frequency signal. Since a voiced sound has its predominant energy components centered in the low frequency portion of the spectrum and an unvoiced sound is composed of predominantly higher frequency energy, this difference is generally positive when voiced energy is present and negative otherwise. However, such processing is not adequate for distinguishing speech in a high level variable ambient noise environment since it is sensitive to relative amplitude differences in the spectral components. Further, such systems do not indicate the relative strength of speech energy present. Other methods of voiced speech identification, including time domain analysis by autocorrelation, have been employed but have not proved entirely satisfactory.
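The prior-art band-difference test described above can be sketched roughly as follows. This is only an illustration, assuming that "subtracting the signals" means subtracting short-term band powers; the function names and the sample values are hypothetical and do not come from the patent.

```python
def mean_power(samples):
    """Average of the squared samples over the analysis window."""
    return sum(x * x for x in samples) / len(samples)


def band_energy_difference(low_band, high_band):
    """Low-band power minus high-band power (positive suggests voiced)."""
    return mean_power(low_band) - mean_power(high_band)


# Hypothetical frame: strong low band, weak high band.
voiced = band_energy_difference([1.0, -1.0, 1.0, -1.0], [0.2, -0.2, 0.2, -0.2])

# Same low band, but with louder high-frequency noise in the high band.
noisy = band_energy_difference([1.0, -1.0, 1.0, -1.0], [1.5, -1.5, 1.5, -1.5])

print(voiced > 0, noisy > 0)
```

The sign of the difference flips as soon as the high-band noise power exceeds the low-band power, which is the sensitivity to relative amplitude that the text identifies as the weakness of this approach.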

SUMMARY OF THE INVENTION

Consequently, it is an object of the present invention to detect the presence and relative intensity of a speech signal in a high level variable ambient noise environment.

In accordance with the present invention, two signals are derived from the signal to be analyzed. The first is a reference signal and is representative of the power of the received signal. The reference signal is derived by dividing the signal to be analyzed into two contiguous frequency subbands, extracting signals proportional to the power in each subband, and multiplying these signals together.

The second is a comparison signal and is proportional to the degree of common periodicity between the high and low frequency components of the received signal. The comparison signal is derived by processing the signal in either one of the subband channels by a periodicity-preserving nonlinear process to produce a signal with frequency components which overlap the components of the signal in the other subband channel. The two overlapping subband signals are then multiplied together in such a way as to compensate for phase differences between them. The resulting product, averaged over a period of time, is the comparison signal.

The reference and comparison signals are compared in an appropriate comparator or threshold detector to determine the presence or absence of speech or the proportion of speech in the input.

BRIEF DESCRIPTION OF THE DRAWING

The invention will be fully apprehended from the following description of an illustrative embodiment thereof, taken in conjunction with the appended drawing which is a block schematic diagram of apparatus for detecting the presence of speech in a variable ambient noise environment.

In accordance with a preferred embodiment of the invention, and referring to the drawing, an acoustic wave is converted to an electrical analog signal by transducer 8. To determine whether or not speech is present in this analog signal, the signal is applied to bandpass filtering networks 12 and 13 by means of signal channels 9 and 10.

Bandpass filter 12 is adjusted to pass all signal components below a selected frequency and block all others and is constructed to provide two outputs in quadrature phase, at terminals 12a and 12b. The design of such filtering networks is well known. Bandpass filtering network 13 is adjusted to pass all signal components above the selected frequency. This separation of the input signal into two contiguous frequency subbands is a first step in the comparison of the low and high frequency energy components of the input signal. When the comparison of these components reveals that they are periodic with a common fundamental period, the presence of speech is indicated. A reference for such comparison is derived as follows:

A first output of network 12, at 12b in the drawing, is delivered to the input of squaring network 14. Network 14 is of well-known design and produces a signal at its output terminal proportional to the square of the signal at its input. The output of squaring network 14 is applied to averaging network 15 which produces an output signal proportional to the input signal averaged over a time period. This squaring and averaging process produces a signal proportional to the signal power in the output of network 12.

The output of BPF network 13 is similarly applied to a squaring network, 16, and then applied to averaging network 17. Output signals from averaging networks 17 and 15 are applied to multiplying network 18. Network 18 produces a reference signal proportional to the product of the power in the signal passed by filter 12 and the power in the signal passed by filter 13. This reference signal is applied to comparator network 19 for comparison with a comparison signal.
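The reference path just described (square, average, multiply) can be sketched in a few lines. This is a minimal discrete-time illustration, assuming the two subband signals are already available as sample lists and that a plain mean over the window stands in for the averaging networks; the function names and sample values are hypothetical.

```python
def mean_power(samples):
    """Square each sample and average (networks 14/15 or 16/17)."""
    return sum(x * x for x in samples) / len(samples)


def reference_signal(low_band, high_band):
    """Product of the two subband powers (multiplying network 18)."""
    return mean_power(low_band) * mean_power(high_band)


low = [1.0, -1.0, 1.0, -1.0]    # low-band samples, power 1.0
high = [0.5, 0.5, -0.5, -0.5]   # high-band samples, power 0.25
ref = reference_signal(low, high)
print(ref)
```

Because the reference is a product of powers rather than a sum, it scales with the gain of each band separately, which matters for the amplitude-independence argument made later in the description.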

To derive a comparison signal which is representative of the degree of common periodicity between the signals in filters 12 and 13, the output of BPF network 13 is applied to envelope detecting network 20 which produces a signal proportional to the envelope of the applied signal. Frequency components contained in this envelope signal overlap the components of the signal passed by filter 12. The product of these signals will accentuate common periodicities between them. The output of envelope detecting network 20 is applied to multiplying network 21 together with the first output of BPF network 12. Multiplying network 21 produces a first product signal proportional to the product of the applied signals. This first product signal is applied to averaging network 22 which provides an output proportional to the first product signal averaged over a time period. By applying this averaged signal to squaring network 23, a signal proportional to the square of the averaged signal is produced. This averaged and squared first product signal is applied to summing network 24 for addition to a second product signal.

The second product signal, which is required to compensate for phase differences between the low and high frequency components of the received signal, is formed as follows. The second output 12a of bandpass filtering network 12, which carries a signal 90 degrees out of phase with the signal on 12b, and the output of envelope detecting network 20 are applied to product network 25 which produces a signal proportional to the product of these two signals. This signal is applied to averaging network 26 and then to squaring network 27, which have the same effect as networks 22 and 23. The resulting averaged and squared signal is applied to summing network 24, the output of which is proportional to the sum of the two signals applied to it. This output is directed to comparator circuit 19 for comparison with the reference signal.
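The two product paths and their sum can be illustrated with synthetic signals. The sketch below assumes the quadrature low-band outputs and the high-band envelope are already available as sample lists; the sinusoidal test signals, the function names, and the parameters are hypothetical choices made only to show that the summed output does not depend on the phase offset between the envelope and the low band.

```python
import math


def average(samples):
    return sum(samples) / len(samples)


def comparison_signal(low_in_phase, low_quadrature, high_envelope):
    """Sum of the squared, averaged products (networks 21-27 and 24)."""
    first = average([e * x for e, x in zip(high_envelope, low_in_phase)])
    second = average([e * x for e, x in zip(high_envelope, low_quadrature)])
    return first ** 2 + second ** 2


def demo(phase, envelope_cycles=5, low_cycles=5, n=1000):
    """Low band: one sinusoid; envelope: offset sinusoid shifted by `phase`."""
    low_i = [math.cos(2 * math.pi * low_cycles * k / n) for k in range(n)]
    low_q = [math.sin(2 * math.pi * low_cycles * k / n) for k in range(n)]
    env = [1.0 + math.cos(2 * math.pi * envelope_cycles * k / n + phase)
           for k in range(n)]
    return comparison_signal(low_i, low_q, env)


# Common period, two different phase offsets: the outputs agree.
print(round(demo(0.0), 6), round(demo(1.0), 6))
# Envelope with a different period: the output is close to zero.
print(round(demo(0.0, envelope_cycles=7), 6))
```

With only the in-phase product, the first case would vary with the cosine of the phase offset; adding the quadrature product restores the missing sine term, so the sum of squares stays constant.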

Comparator 19 may, in one illustrative embodiment, be designed to provide a binary output. In this case, when the comparison signal is a specified increment larger than the reference signal or when the ratio of the two signals exceeds a given ratio, a voiced signal in the input is indicated, for example, by a binary one signal in output channel 30. When the comparison signal differs from the reference signal by less than the specified increment or the ratio falls below the given ratio, the absence of voiced signal components in the input signal is indicated, for example, by a binary zero signal in output channel 30. In an alternative embodiment, the comparator may provide a signal in channel 30 proportional to the ratio of the comparison signal to the reference signal. Such a signal represents the quantity of speech energy present at the transducer 8. Comparator circuits of either construction are well known in the art.
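Both comparator variants can be sketched as small functions. This is an illustration only: it uses the ratio form of the binary test, the threshold value is a hypothetical parameter not given in the patent, and treating a zero reference as "no speech" is an assumption added to avoid division by zero.

```python
def binary_decision(comparison, reference, min_ratio):
    """Return 1 (voiced) when comparison/reference exceeds min_ratio, else 0."""
    if reference <= 0.0:
        return 0  # assumption: no input power is treated as no speech
    return 1 if comparison / reference > min_ratio else 0


def speech_proportion(comparison, reference):
    """Alternative embodiment: output proportional to the ratio itself."""
    if reference <= 0.0:
        return 0.0  # same assumption as above
    return comparison / reference


print(binary_decision(0.2, 0.25, min_ratio=0.5))   # ratio 0.8
print(binary_decision(0.01, 0.25, min_ratio=0.5))  # ratio 0.04
print(speech_proportion(0.5, 2.0))
```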

It should be apparent from the foregoing discussion that the comparator's operation is independent of relative changes in the amplitudes of the high and low frequency components of the speech-noise signal. For example, if the amplitude of the high frequency component of the input signal increases by a factor of 2 and the amplitude of the low frequency signal component increases by a factor of 3, the reference signal increases by a factor of 36. Under these circumstances, the comparison signal similarly increases by a factor of 36, thereby facilitating the detection of speech in the presence of a variable noise signal.
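The factor of 36 follows because each power scales with the square of its amplitude (4 times 9), and because each averaged product scales by 2 times 3 before being squared. A numerical check is sketched below, assuming synthetic sinusoidal signals and an envelope detector whose output scales linearly with the high-band amplitude; all names and signal parameters are hypothetical.

```python
import math


def average(samples):
    return sum(samples) / len(samples)


def reference(low, high):
    """Product of the two subband powers."""
    return average([x * x for x in low]) * average([x * x for x in high])


def comparison(low_i, low_q, envelope):
    """Sum of the squared averages of the two quadrature products."""
    a = average([e * x for e, x in zip(envelope, low_i)])
    b = average([e * x for e, x in zip(envelope, low_q)])
    return a * a + b * b


def scaled(samples, gain):
    return [gain * x for x in samples]


n = 1000
low_i = [math.cos(2 * math.pi * 5 * k / n) for k in range(n)]
low_q = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
envelope = [1.0 + math.cos(2 * math.pi * 5 * k / n + 0.3) for k in range(n)]
# Synthetic high band: a carrier modulated by the envelope above.
high = [e * math.cos(2 * math.pi * 100 * k / n) for k, e in enumerate(envelope)]

# Low band amplitude x3, high band (and hence its envelope) x2.
ref_gain = reference(scaled(low_i, 3), scaled(high, 2)) / reference(low_i, high)
cmp_gain = (comparison(scaled(low_i, 3), scaled(low_q, 3), scaled(envelope, 2))
            / comparison(low_i, low_q, envelope))
print(round(ref_gain, 6), round(cmp_gain, 6))
```

Since both quantities grow by the same factor, their ratio is unchanged, which is why a ratio-based comparator is unaffected by independent gain changes in the two bands.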

In a variation of the above embodiment of the invention, the input signal to be analyzed may be divided into more than two contiguous frequency subbands. In this case, each subband is paired with another and each subband pair is processed as described above. The criterion for determining the presence of speech will depend on the application in which the invention is employed. An example of such a criterion is as follows: If any preselected number of subband pairs indicates the presence of speech, then speech is considered to be present in the input signal.
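The example criterion amounts to counting how many subband pairs report speech. A minimal sketch, assuming each pair has already produced a binary decision; the function name and the threshold value are hypothetical.

```python
def speech_present(pair_decisions, min_pairs):
    """True when at least min_pairs subband pairs indicate speech."""
    return sum(1 for decision in pair_decisions if decision) >= min_pairs


# Three subband pairs, speech declared when at least two agree.
print(speech_present([1, 0, 1], min_pairs=2))
print(speech_present([1, 0, 0], min_pairs=2))
```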

It is to be understood that the above-described arrangements are merely illustrative of application of the principles of the invention. Other arrangements may be devised by those skilled in the art without departing from the spirit and scope of the invention.

What is claimed is:

1. A speech discriminator comprising, means for separating an input signal into a plurality of contiguous frequency subband signals, a plurality of signal channels each supplied with one of said subband signals, means for deriving a reference signal representative of the signals in a first and a second of said channels, means for deriving a comparison signal representative of the degree of common periodicity between the signals in said first and said second signal channels, means for comparing said comparison signal with said reference signal.

2. A speech discriminator as defined in claim 1 wherein the signal in said first signal channel is lower in frequency than the signal in said second signal channel, and wherein said means for deriving said comparison signal includes means for deriving an envelope signal proportional to the envelope of the signal in said second signal channel, and means for combining said envelope signal with the signal in said first signal channel.

3. A speech discriminator as defined in claim 1 wherein said means for deriving a comparison signal includes nonlinear, periodicity preserving means for processing said signal in a selected one of said signal channels.

4. A speech discriminator as defined in claim 3 wherein said means for deriving a comparison signal further includes means for deriving a first signal proportional to the product of said processed signal in said selected signal channel and said signal in a second selected signal channel, means for deriving a phase shifted signal proportional to the signal in said second selected signal channel shifted in phase, means for deriving a second signal proportional to the squared and time averaged product of said phase shifted signal and said signal in said selected signal channel, and means for deriving a sum signal proportional to the sum of said first and said second signals.

5. Apparatus for identifying the presence of speech in an analog signal comprising, means for deriving a first signal proportional to the low frequency components of an input signal, means for deriving a second signal proportional to the envelope of the high frequency components of an input signal, means for deriving a reference signal indicative of the power in said input signal, means for compensating for phase differences between said first and said second signal, means for deriving a product signal proportional to the product of said first signal and said second signal, means for deriving a threshold measure from said reference signal, means for comparing said reference signal with said threshold quantity.

6. A speech discriminator comprising, means for separating an input signal into two contiguous frequency subband signals, first and second signal channels each provided with one of said subband signals, means for deriving a signal proportional to the signal in said first channel shifted in phase by 90 degrees, means for deriving an envelope signal proportional to the envelope of the signal in said second signal channel, means for deriving a first product signal proportional to the product of said first subband signal and said envelope signal, means for deriving a second product signal proportional to the product of said envelope signal and said phase shifted signal, means for deriving a third signal proportional to the time average of the square of said first product signal, means for deriving a fourth signal proportional to the time average of the square of said second product signal, adder means for deriving a sum signal proportional to the sum of said third signal and said fourth signal, means for deriving a reference signal related to said first and second subband signals, means for comparing said reference signal with said sum signal, means for indicating the result of said comparison.

7. A speech discriminator as defined in claim 6 wherein said means for deriving a reference signal comprises, means for deriving a first power signal proportional to the power of said first subband signal, means for deriving a second power signal proportional to the power of said second subband signal, and multiplier means for multiplying said first power signal with said second power signal.

8. A speech discriminator as defined in claim 6 wherein said means for comparing said reference signal with said sum signal comprises a comparator which provides output signals proportional to the ratio of said sum signal and said comparison signal.

9. A speech discriminator as defined in claim 6 wherein said means for comparing said reference signal with said sum signal comprises a comparator which provides a binary output.

References Cited

UNITED STATES PATENTS

3,238,303 3/1966 Dersch 179-1

WILLIAM C. COOPER, Primary Examiner

D. W. OLMS, Assistant Examiner

